FAST METHOD FOR RENEWAL AND ASSOCIATED
RECOMMENDATIONS FOR MARKET BASKET ITEMS 



DESCRIPTION 
BACKGROUND OF THE INVENTION 

Field of the Invention

The present invention generally relates to a computer method and
system for placing orders for products over a computer network, such as the
Internet, and more particularly, to a way to more effectively and efficiently
determine a customer's preferences while the customer's choices are in
progress in order to make recommendations of other items the customer might
be interested in purchasing. More generally, making further recommendations
while a customer is making choices applies to any such situation, e.g., a customer
makes a series of Internet surfing choices and new sites are dynamically
recommended and displayed (by icons). Aside from virtual shopping carts,
this can also apply to a real shopping cart fitted with a display. As a customer fills
the cart, the display points to the next items the customer is likely to add to the
cart.

Background Description 

Shopping on the World Wide Web (WWW or simply the Web) portion
of the Internet has become ubiquitous in our society. A typical Web site
offering products for purchase employs what is referred to as a "market
basket", a sort of virtual shopping cart without wheels. The customer selects
items to add to his or her market basket, and when he or she completes the
shopping, a "check out" button is selected to process the items then in the
market basket.

A market strategy has developed which involves monitoring the items
in the customer's market basket and, taking other factors into account,
including possibly the customer's past buying habits and similar choices made
by other customers, making recommendations to the customer of other items
he or she might be interested in purchasing. In the past decade,
recommendations to a customer who has items in a market basket have been
made using so-called associative rules mined from the market basket data, or
by several other means described, for example, in P. Resnick, N. Iacovou, M.
Suchak, P. Bergstrom and J. Riedl, "GroupLens: An open architecture for
collaborative filtering of netnews", Proceedings of the ACM 1994 Conference
on Computer Supported Cooperative Work, pp. 175-186, ACM, New York
(1994), J. Breese, D. Heckerman, and C. Kadie, "Empirical analysis of
predictive algorithms for collaborative filtering", Proceedings of the Fourteenth
Conference on Uncertainty in Artificial Intelligence, Morgan Kaufmann,
Madison, Wis. (1998), and others. The associative rules cannot be tailored to
all possible partial market baskets. All the prior art in so-called collaborative
filtering techniques requires a substantial amount of computation.



SUMMARY OF THE INVENTION 



It is therefore an object of the present invention to provide a more
effective and efficient process for recommending items to a customer for their
market basket in an e-commerce site.

According to the invention, a new method is provided which is based
on a novel theory that "not all items in the basket are selected because of their
affinity with some other item already in the basket." The method uniquely
determines two separate components of item choice preferences: preference
by association with existing items in the basket in progress, and independently
exercised purchases. The former is the usual preference considered by all prior
art methods. The latter is the renewal buying not considered by the prior art. In
the present invention, these two preferences are separately estimated from the
training data and combined in proper proportions to obtain the overall
preference for each item not yet in the basket. The recommendations are
presented in the form of a ranking, from which some subset of items at the top
will be presented to the customer. The ranking is obtained from computed
probabilities for each item that is not in the current basket, given the partial
basket in progress. The method disclosed here is not restricted to purchasing
of items. It can also be used for recommending new Web sites to someone
browsing the Internet.

BRIEF DESCRIPTION OF THE DRAWINGS 

The foregoing and other objects, aspects and advantages will be better
understood from the following detailed description of a preferred embodiment
of the invention with reference to the drawings, in which:

Figure 1 is a table showing an array of binary data which represents
items in market baskets; and

Figure 2 is a flow diagram showing the logic of the computer
implemented process according to the invention.

DETAILED DESCRIPTION OF A PREFERRED 
EMBODIMENT OF THE INVENTION 

Referring now to the drawings, and more particularly to Figure 1, there
is shown a table which illustrates a binary array which represents items in
market baskets. In this table, each row is a basket and each column represents
an item. A binary value 1 in row i and column j signifies that basket i
contained item j, and a value 0 signifies the absence of the item. We shall denote such
a market basket data array as M, comprised of n baskets (rows) and m items
(columns).
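By way of illustration only, the array M can be sketched in Python as follows. The toy baskets, the use of integers 0 to m-1 as item identifiers, and the helper name `build_basket_matrix` are assumptions made for this example and are not taken from Figure 1.

```python
def build_basket_matrix(baskets, m):
    """Return an n-by-m list of 0/1 rows: row i is basket i, column j is item j."""
    matrix = []
    for basket in baskets:
        row = [0] * m
        for item in basket:
            row[item] = 1  # item j is present in this basket
        matrix.append(row)
    return matrix


# Hypothetical training data: n = 6 baskets over m = 3 items (0, 1, 2).
baskets = [{0}, {0, 1}, {0, 1, 2}, {1, 2}, {2}, {0, 1}]
M = build_basket_matrix(baskets, m=3)

# Column sums give the number of baskets containing each item.
column_sums = [sum(row[j] for row in M) for j in range(3)]
```

A dense list of lists is used here only for readability; for a large number of items a sparse representation would normally be preferred.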

The current partial basket is denoted as B, the content items of which
are denoted as i_1, i_2, ..., i_b, where the number of items in the basket, b, can be 0
if the basket is just beginning. Such a case will be called a null basket, and the
method to determine the preferences for the null basket will be separately
described later.

The probability of a customer buying item j given the partial basket B
is P(j | B). The key concept is to separately consider the probability
components: one due to associative buying, and the other due to an
independent, or renewal, choice.

$$\begin{aligned}
P(j \mid B) &= P(j, \text{asso} \mid B) + P(j, \text{renewal} \mid B) \\
&= P(j \mid \text{asso}, B)\, P(\text{asso} \mid B) + P(j \mid \text{renewal}, B)\, P(\text{renewal} \mid B),
\end{aligned} \tag{1}$$

for all j not in B where, since one buys associatively or independently,

$$P(\text{asso} \mid B) = 1 - P(\text{renewal} \mid B). \tag{2}$$
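As a minimal sketch of how equations (1) and (2) combine the two components, the following Python function takes the three conditional probabilities as inputs. The function name and the numerical values are hypothetical and serve only to show the arithmetic.

```python
from fractions import Fraction


def overall_preference(p_j_asso, p_j_renewal, p_renewal_basket):
    """Combine the associative and renewal components for one item j."""
    p_asso_basket = 1 - p_renewal_basket  # equation (2)
    return p_j_asso * p_asso_basket + p_j_renewal * p_renewal_basket  # equation (1)


# Hypothetical inputs for one candidate item j and one partial basket B.
p = overall_preference(
    p_j_asso=Fraction(3, 4),          # P(j | asso, B)
    p_j_renewal=Fraction(1, 4),       # P(j | renewal, B)
    p_renewal_basket=Fraction(1, 5),  # P(renewal | B)
)
```

With these inputs the result is 3/4 x 4/5 + 1/4 x 1/5 = 13/20.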

And in the case of a renewal buy, the basket content is immaterial except for
those items already in the partial basket B, and hence

$$\begin{aligned}
P(j \mid \text{renewal}, B) &= P(j \mid \text{renewal}) = P(j, \text{renewal}) / P(\text{renewal}) \\
&= P(\text{renewal} \mid j)\, P(j) / P(\text{renewal}),
\end{aligned} \tag{3}$$

where P(j) is the probability of item j being bought.
YOR9-2000-0776US 
Now we make a simple but reasonable assumption about the purchase
behavior, which we name "single item influence". That is, whether the next buy is
renewal or associative is determined as an aggregate of such tendencies of
the items in the current basket, taken singly. In other words, an associative next buy
would be the result of its association with some one item in the basket and not
because more than one item was needed for the association. We likewise
assume each single item exerts its own tendency toward non-associative, i.e.,
renewal, buying. These assumptions are reasonable and allow an efficient
computation.

We make further simplifying assumptions about the purchasing
behavior regarding the aggregation of the single item influence. In the case of
renewal, we reasonably assume that the least renewal tendency among all the
basket items dictates the final renewal. So, for aggregating the renewal
probabilities,

$$P(\text{renewal} \mid B) = \min_k P(\text{renewal} \mid i_k), \quad \text{for } k = 1, 2, \ldots, b, \tag{4}$$

which will be estimated from the data in a manner described below. And in
the case of associated buying, we reasonably assume that the maximum
preference to associatively select an item j among the items in the partial
basket B determines the overall preference for the item j. That is, in
pre-normalized form,

$$P'(j \mid \text{asso}, B) = \max_k P(j \mid \text{asso}, i_k), \quad \text{for } k = 1, 2, \ldots, b. \tag{5}$$

This quantity is set to zero for all items in the current partial basket for which
recommendations are made. After that, the values are normalized into probabilities, as

$$P(j \mid \text{asso}, B) = \frac{P'(j \mid \text{asso}, B)}{\sum_k P'(k \mid \text{asso}, B)} \quad \text{for all items } j. \tag{6}$$
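The aggregation of the single item influence in equations (4), (5) and (6) can be sketched in Python as below. The function name, the table layout (`p_asso_pair[i][j]` holding P(j | asso, i)) and all numerical values are assumptions chosen for illustration; they are not estimates from any training data.

```python
from fractions import Fraction


def aggregate_single_item_influence(basket, p_renewal_item, p_asso_pair):
    """Apply equations (4), (5) and (6) to a non-empty partial basket.

    p_renewal_item[i] holds P(renewal | i); p_asso_pair[i][j] holds P(j | asso, i).
    """
    m = len(p_renewal_item)
    # Equation (4): the least renewal tendency among the basket items.
    p_renewal_basket = min(p_renewal_item[i] for i in basket)
    # Equation (5): the strongest single-item association; zero for items in B.
    pre = [
        Fraction(0) if j in basket else max(p_asso_pair[i][j] for i in basket)
        for j in range(m)
    ]
    # Equation (6): normalize so that the values sum to one.
    total = sum(pre)
    p_asso_basket = [x / total if total else Fraction(0) for x in pre]
    return p_renewal_basket, p_asso_basket


# Hypothetical model values for m = 4 items.
p_renewal_item = [Fraction(2, 5), Fraction(1, 5), Fraction(1, 2), Fraction(3, 10)]
p_asso_pair = [
    [Fraction(0), Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)],
    [Fraction(1, 2), Fraction(0), Fraction(3, 8), Fraction(1, 8)],
    [Fraction(1, 3), Fraction(1, 3), Fraction(0), Fraction(1, 3)],
    [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4), Fraction(0)],
]
p_ren_B, p_asso_B = aggregate_single_item_influence({0, 1}, p_renewal_item, p_asso_pair)
```

For the basket {0, 1}, the renewal probability is min(2/5, 1/5) = 1/5, and the pre-normalized associative preferences for items 2 and 3 are 3/8 and 1/4, which normalize to 3/5 and 2/5.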



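Now, the probability P(j | asso, i_k) of equation (5) is equivalent to (using i for
i_k)

$$\begin{aligned}
P(j \mid \text{asso}, i) &= \frac{P(j, i, \text{asso})}{P(i, \text{asso})} = \frac{P(j, i)\, P(\text{asso} \mid j, i)}{P(i)\, P(\text{asso} \mid i)} \\
&= \left[\frac{P(j, i)}{P(i)}\right] \frac{\{1 - P(\text{renewal} \mid j, i)\}}{\{1 - P(\text{renewal} \mid i)\}} \\
&= P(j \mid i)\, \frac{\{1 - P(\text{renewal} \mid j, i)\}}{\{1 - P(\text{renewal} \mid i)\}}.
\end{aligned} \tag{7}$$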



When the partial basket in progress is empty, i.e., the null basket at the start, a
customer is at precisely the "renewal" point. Therefore, for the null basket
B = null, equation (1) is specialized by use of equation (3):

$$P(j \mid \text{null}) = P(j \mid \text{renewal}). \tag{8}$$

Now we describe sub-methods to estimate P(j), P(renewal),
P(renewal | j, i), P(j | i), etc. of the above equations from the data.

• P(j) estimation: precomputed and stored in a length m vector.

Let the column sums of M be n_1, n_2, ..., n_k, ..., n_m. The probability of
item j being bought is then

$$P(j) = \frac{n_j}{\sum_k n_k} \tag{9A}$$

or optionally, with a Laplace correction for small statistics, which in its standard add-one form is

$$P(j) = \frac{n_j + 1}{\sum_k n_k + m}. \tag{9B}$$
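A minimal Python sketch of the P(j) estimate follows. The plain ratio corresponds to equation (9A). For the Laplace-corrected variant, the usual add-one form (add 1 to each count and m to the total) is assumed here; the function name and the column sums are hypothetical.

```python
from fractions import Fraction


def item_probabilities(column_sums, laplace=False):
    """Estimate P(j) from the column sums n_1, ..., n_m of M."""
    total = sum(column_sums)
    m = len(column_sums)
    if laplace:
        # Assumed add-one form of the Laplace correction (equation (9B)).
        return [Fraction(n_j + 1, total + m) for n_j in column_sums]
    return [Fraction(n_j, total) for n_j in column_sums]  # equation (9A)


# Hypothetical column sums for m = 3 items.
p_plain = item_probabilities([4, 4, 3])
p_smoothed = item_probabilities([4, 4, 3], laplace=True)
```

Both variants sum to one over the m items; the corrected variant keeps a rarely bought item from receiving a probability of exactly zero.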



• P(renewal) estimation.

Let the number of singleton baskets of item j be n'_j. The renewal probability is
estimated conservatively (that is, underestimated) by the proportion of all singleton
baskets to the total items purchased in the training data. The reason is that every
time only one item was bought, it is certainly a case of renewal. The renewal
probability is then

$$P(\text{renewal}) = \frac{\sum_j n'_j}{\sum_k n_k}. \tag{10}$$



• P(renewal | i) estimation: precomputed and stored in a length m vector.

Given that item i is bought, the estimate of the renewal probability is made
in two stages. Let the total number of baskets where item i is the singleton
basket content be n'_i; then for n'_i / n_i of the time, it is a certain case of renewal,
and for the remaining proportion, i.e., for 1 - n'_i / n_i of the time, there are
other items bought along with item i, but some portion of it, which we
estimate to be P(renewal), would also be a renewal case. Therefore, the estimate
is

$$P(\text{renewal} \mid i) = \frac{n'_i}{n_i} + P(\text{renewal}) \times \left(1 - \frac{n'_i}{n_i}\right). \tag{11}$$
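The two-stage renewal estimate can be sketched in Python as follows, taking P(renewal) as the ratio of all singleton baskets to the total items purchased (equation (10)) and then applying equation (11) per item. The function name and the counts are hypothetical, and every item is assumed to appear in at least one basket so that n_i is nonzero.

```python
from fractions import Fraction


def renewal_probabilities(n, n_single):
    """Return P(renewal) and the list of P(renewal | i).

    n[i] is the number of baskets containing item i; n_single[i] is the number
    of baskets in which item i is the only item.
    """
    p_renewal = Fraction(sum(n_single), sum(n))  # equation (10)
    p_renewal_item = []
    for n_i, n_single_i in zip(n, n_single):
        certain = Fraction(n_single_i, n_i)  # singleton share: certain renewal
        # Equation (11): the remainder is renewal with probability P(renewal).
        p_renewal_item.append(certain + p_renewal * (1 - certain))
    return p_renewal, p_renewal_item


# Hypothetical counts for m = 3 items.
p_renewal, p_renewal_item = renewal_probabilities(n=[4, 4, 3], n_single=[1, 0, 1])
```

For item 0 this gives 1/4 + (2/11)(3/4) = 17/44; an item that never appears alone, such as item 1, falls back to P(renewal) = 2/11.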



• P(j | renewal) computation: precomputed and stored in a length m
vector.

P(j | renewal) is computed according to equation (3), using the above
estimated quantities, and stored.

• P(j | i) estimation:

Let the subset of M that has 1 in the i-th column be M_i, i.e., those rows
that have item i in the basket. The j-th column sum of M_i, denoted as n_ji,
represents the number of times j was bought along with i. Therefore,

$$P(j \mid i) = \frac{n_{ji}}{n_i}, \quad \text{and we fix } P(i \mid i) \text{ to be } 0. \tag{12}$$
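As a minimal sketch, the co-occurrence counts n_ji and the estimate of equation (12) can be obtained directly from the baskets rather than by forming each sub-matrix M_i explicitly. The function name, the table layout (`table[i][j]` holding P(j | i)) and the toy baskets are assumptions for this example.

```python
from fractions import Fraction
from itertools import permutations


def conditional_probabilities(baskets, m):
    """Estimate P(j | i) = n_ji / n_i; returns table[i][j]."""
    n = [0] * m                           # n_i: baskets containing item i
    n_pair = [[0] * m for _ in range(m)]  # n_pair[i][j] = n_ji
    for basket in baskets:
        for i in basket:
            n[i] += 1
        for i, j in permutations(basket, 2):  # ordered pairs with i != j
            n_pair[i][j] += 1
    # P(i | i) stays 0 because the diagonal of n_pair is never incremented.
    return [
        [Fraction(n_pair[i][j], n[i]) if n[i] else Fraction(0) for j in range(m)]
        for i in range(m)
    ]


# Hypothetical training data: 6 baskets over m = 3 items.
baskets = [{0}, {0, 1}, {0, 1, 2}, {1, 2}, {2}, {0, 1}]
p_cond = conditional_probabilities(baskets, m=3)
```

Here item 0 appears in four baskets, three of which also contain item 1, so P(1 | 0) = 3/4.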



• P(renewal | j, i) estimation:

From the sub-matrix M_i above, a row whose sum is exactly 2
represents a certain case of renewal. Let n'_ji denote the number of rows whose
row sum in M_i is exactly 2 and which contain item j. The certain renewal proportion
is n'_ji / n_ji. In the remaining cases, we estimate that the renewal probability is the same as
P(renewal). So,

$$P(\text{renewal} \mid j, i) = \frac{n'_{ji}}{n_{ji}} + P(\text{renewal}) \times \left(1 - \frac{n'_{ji}}{n_{ji}}\right). \tag{13}$$
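The pairwise renewal estimate of equation (13) may be illustrated with the following Python sketch. The helper names, the toy baskets and the value used for P(renewal) are hypothetical; the pair is assumed to co-occur at least once so that n_ji is nonzero.

```python
from fractions import Fraction


def count_pair(baskets, i, j):
    """Return (n_ji, n'_ji): baskets with both items, and baskets with only those two."""
    together = sum(1 for b in baskets if i in b and j in b)
    only_two = sum(1 for b in baskets if b == {i, j})
    return together, only_two


def pair_renewal_probability(n_pair, n_pair_only, p_renewal):
    """Equation (13) for one pair (j, i) with n_pair > 0."""
    certain = Fraction(n_pair_only, n_pair)  # two-item baskets: certain renewal
    return certain + p_renewal * (1 - certain)


# Hypothetical training data and a P(renewal) value assumed to be known.
baskets = [{0}, {0, 1}, {0, 1, 2}, {1, 2}, {2}, {0, 1}]
n_10, n_10_only = count_pair(baskets, 0, 1)
p_ren_10 = pair_renewal_probability(n_10, n_10_only, Fraction(2, 11))
```

Items 0 and 1 occur together three times, twice as the only two items, so the estimate is 2/3 + (2/11)(1/3) = 8/11.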



• P(j | asso, i) computation: precomputed and stored in an m by m array
or an equivalent sparse matrix representation.

Using the estimates above, P(j | asso, i) of equation (7) is computed and
stored.
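A minimal Python sketch of this precomputation is given below, using a dictionary keyed by (j, i) as one possible sparse representation. The function name, the key convention and the input values are assumptions for illustration, and P(renewal | i) is assumed to be strictly less than 1 so that the denominator of equation (7) is nonzero.

```python
from fractions import Fraction


def associative_probability(p_j_given_i, p_renewal_pair, p_renewal_item):
    """Equation (7): P(j | asso, i) from P(j | i), P(renewal | j, i), P(renewal | i)."""
    return p_j_given_i * (1 - p_renewal_pair) / (1 - p_renewal_item)


# Hypothetical sparse inputs: only pairs that co-occur have an entry, keyed (j, i).
p_j_given_i = {(1, 0): Fraction(3, 4), (2, 0): Fraction(1, 4)}
p_renewal_pair = {(1, 0): Fraction(8, 11), (2, 0): Fraction(2, 11)}
p_renewal_item = {0: Fraction(17, 44)}

p_asso = {
    (j, i): associative_probability(
        p_j_given_i[(j, i)], p_renewal_pair[(j, i)], p_renewal_item[i]
    )
    for (j, i) in p_j_given_i
}
```

With these inputs both entries evaluate to 1/3; the stored values need not sum to one over j, which is why a normalization over the items outside the basket is applied later.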



• P(j | asso, B) computation:

First, we obtain P'(j | asso, B) of equation (5) using equation (7) and
the quantities developed above. Since the items already in the partial basket
are not bought again, we fix it to zero whenever j is in B. Now, the normalized
probability of j being purchased in association with the partial basket is

$$P(j \mid \text{asso}, B) = \frac{P'(j \mid \text{asso}, B)}{\sum_k P'(k \mid \text{asso}, B)}. \tag{14}$$



• P(j | renewal, B) = P(j | renewal) normalization for partial basket B:

The P(j | renewal, B) = P(j | renewal) of equation (3) is now fixed to be zero for
those j's that are already in the partial basket B, and normalized by
dividing by the sum over all j's, before the final goal P(j | B) is computed
from equation (1) using the partial quantities developed herewith.

The final recommendation for items based on the current partial basket
in progress is then in descending P(j | B) ranking. The probability itself can be
used for a direct gain maximization if the profit amount for each item is
known. In that case, one would multiply the probabilities by the
corresponding profit amounts before the ranking is made. More specifically, when
each item's profit amount, $_j, is known, one computes P(j | B) $_j and produces
the ranking for recommendations based on this quantity.
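The difference between the two rankings can be seen in a short Python sketch. The item names, probabilities and profit amounts below are invented for the example, as is the helper name `rank_items`.

```python
def rank_items(p_given_basket, profit=None):
    """Return item ids sorted by P(j | B), or by P(j | B) * profit[j] if given."""
    def score(j):
        return p_given_basket[j] * (profit[j] if profit is not None else 1)
    return sorted(p_given_basket, key=score, reverse=True)


# Hypothetical probabilities for items not yet in the basket, and profit amounts.
p_given_basket = {"bread": 0.5, "wine": 0.3, "cheese": 0.2}
profit = {"bread": 0.25, "wine": 4.0, "cheese": 1.0}

by_probability = rank_items(p_given_basket)
by_expected_profit = rank_items(p_given_basket, profit)
```

In this example the most likely purchase ("bread") drops to last place once the probabilities are weighted by profit, because its expected profit of 0.125 is below that of the other two items.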

The process is illustrated in Figure 2. The method comprises three
steps. The first two steps use the market basket information in the training
data base 201. Specifically, in the first step 202, certain statistics are collected
which are then used in the second step 203 to precompute certain quantities.
The third step 204 uses the precomputed quantities, in the stored statistical
model 205, and the partial market basket information 206 in an online manner
to produce a preference ranking for the remaining unpurchased items. We
assume the training data to contain n market baskets with m items.

In more detail, the first step 202 is to collect statistics from the training
data. This involves the following:

(a) For each item j, obtain n_j, the number of baskets with item j purchased.

(b) For each item j, obtain n'_j, the number of baskets with j being the sole
item purchased.

(c) For each pair of items i and j, obtain the number of market baskets n_ji
with items j and i purchased together.

(d) For each pair of items i and j, obtain the number of market baskets n'_ji
with items i and j being the only two items purchased.
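The four statistics (a) through (d) can be gathered in a single pass over the training baskets, as in the following Python sketch. The function name and toy data are assumptions; because the pair counts are symmetric in i and j, each unordered pair is stored once under the key (low, high).

```python
from collections import Counter
from itertools import combinations


def collect_statistics(baskets):
    """One pass over the training baskets, gathering counts (a) through (d)."""
    n = Counter()            # (a) n_j: baskets containing item j
    n_single = Counter()     # (b) n'_j: baskets containing only item j
    n_pair = Counter()       # (c) n_ji: baskets containing both i and j
    n_pair_only = Counter()  # (d) n'_ji: baskets containing exactly i and j
    for basket in baskets:
        items = sorted(set(basket))
        n.update(items)
        if len(items) == 1:
            n_single[items[0]] += 1
        for pair in combinations(items, 2):  # unordered pair as (low, high)
            n_pair[pair] += 1
            if len(items) == 2:
                n_pair_only[pair] += 1
    return n, n_single, n_pair, n_pair_only


# Hypothetical training data: 6 baskets over 3 items.
baskets = [{0}, {0, 1}, {0, 1, 2}, {1, 2}, {2}, {0, 1}]
n, n_single, n_pair, n_pair_only = collect_statistics(baskets)
```

A `Counter` returns 0 for keys it has never seen, so pairs that never co-occur take no storage, which matches the restriction of the later computations to pairs with a nonzero count.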

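The second step 203 is to precompute model parameters. This involves the
following:

(a) Compute

$$P(\text{renewal}) = \frac{\sum_j n'_j}{\sum_k n_k} \quad \text{(equation (10))}.$$

(b) For each item j, compute

$$P(j) = \frac{n_j}{\sum_k n_k} \quad \text{(equation (9A), or use equation (9B))}.$$

(c) For each item j, compute

$$P(\text{renewal} \mid j) = \frac{n'_j}{n_j} + P(\text{renewal}) \times \left(1 - \frac{n'_j}{n_j}\right) \quad \text{(equation (11))}.$$

(d) For each item j, compute

$$P'(j \mid \text{renewal}) = P(\text{renewal} \mid j) \times \frac{P(j)}{P(\text{renewal})} \quad \text{(equation (3))}.$$

(e) For each pair of items i and j with n_ji ≠ 0, compute

$$P(j \mid i) = \frac{n_{ji}}{n_i} \quad \text{(equation (12))}.$$

(f) For each pair of items i and j with n_ji ≠ 0, compute

$$P(\text{renewal} \mid j, i) = \frac{n'_{ji}}{n_{ji}} + P(\text{renewal}) \times \left(1 - \frac{n'_{ji}}{n_{ji}}\right) \quad \text{(equation (13))}.$$

(g) For each pair of items i and j with n_ji ≠ 0, compute

$$P(j \mid \text{asso}, i) = P(j \mid i) \times \frac{1 - P(\text{renewal} \mid j, i)}{1 - P(\text{renewal} \mid i)} \quad \text{(equation (7))}.$$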

The third step is to calculate a recommended ordering for a given partial
market basket. Given a partial basket B = {i_1, i_2, ..., i_b}, let $\bar{B}$ be the
complementary set of items not in B. Then

(a) If B is empty, then sort the items in order of decreasing P(j | renewal) and
return this as the item preference ordering.

(b) If B is non-empty, then

(i) Compute

$$P(\text{renewal} \mid B) = \min_{i_k \in B} P(\text{renewal} \mid i_k) \quad \text{(equation (4))}.$$

(ii) Compute the normalization factor $\sum_{k \in \bar{B}} P'(k \mid \text{renewal})$.

(iii) For each item $j \in \bar{B}$, compute

$$P(j \mid \text{renewal}, B) = \frac{P'(j \mid \text{renewal})}{\sum_{k \in \bar{B}} P'(k \mid \text{renewal})}.$$

(iv) For each item $j \in \bar{B}$, compute

$$P'(j \mid \text{asso}, B) = \max_{i_k \in B} P(j \mid \text{asso}, i_k) \quad \text{(equation (5))}.$$

(v) Compute the normalization factor $\sum_{k \in \bar{B}} P'(k \mid \text{asso}, B)$.

(vi) For each item $j \in \bar{B}$, compute

$$P(j \mid \text{asso}, B) = \frac{P'(j \mid \text{asso}, B)}{\sum_{k \in \bar{B}} P'(k \mid \text{asso}, B)} \quad \text{(equation (6))}.$$

(vii) For each item $j \in \bar{B}$, compute

$$P(j \mid B) = P(j \mid \text{asso}, B)\, P(\text{asso} \mid B) + P(j \mid \text{renewal}, B)\, P(\text{renewal} \mid B) \quad \text{(equation (1))}.$$

(viii) Sort the items in order of decreasing P(j | B) and return this as the
item preference ordering.
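
The online step can be sketched in Python as shown below. The sketch assumes that the model quantities P(renewal | i), P'(j | renewal) and P(j | asso, i) have already been precomputed; the function name, the dense table layout and all numerical values are hypothetical and chosen only so that the arithmetic can be followed by hand.

```python
from fractions import Fraction


def recommend(basket, p_renewal_item, p_item_renewal, p_asso_pair):
    """Return (item, score) pairs for items not in `basket`, best first.

    p_renewal_item[i] = P(renewal | i); p_item_renewal[j] = P'(j | renewal);
    p_asso_pair[i][j] = P(j | asso, i).
    """
    m = len(p_item_renewal)
    rest = [j for j in range(m) if j not in basket]
    if not basket:
        # Null basket: order by P(j | renewal).
        scores = {j: p_item_renewal[j] for j in rest}
    else:
        p_ren_b = min(p_renewal_item[i] for i in basket)                 # equation (4)
        ren_total = sum(p_item_renewal[j] for j in rest)
        pre = {j: max(p_asso_pair[i][j] for i in basket) for j in rest}  # equation (5)
        asso_total = sum(pre.values())
        scores = {}
        for j in rest:
            p_ren = p_item_renewal[j] / ren_total if ren_total else 0
            p_asso = pre[j] / asso_total if asso_total else 0            # equation (6)
            scores[j] = p_asso * (1 - p_ren_b) + p_ren * p_ren_b         # equation (1)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical precomputed model for m = 4 items.
F = Fraction
p_renewal_item = [F(2, 5), F(1, 5), F(1, 2), F(3, 10)]
p_item_renewal = [F(1, 10), F(1, 5), F(3, 10), F(2, 5)]
p_asso_pair = [
    [F(0), F(1, 2), F(1, 4), F(1, 4)],
    [F(1, 2), F(0), F(3, 8), F(1, 8)],
    [F(1, 3), F(1, 3), F(0), F(1, 3)],
    [F(1, 2), F(1, 4), F(1, 4), F(0)],
]
ranking = recommend({0, 1}, p_renewal_item, p_item_renewal, p_asso_pair)
```

For the basket {0, 1}, item 2 scores (3/5)(4/5) + (3/7)(1/5) = 99/175 and item 3 scores (2/5)(4/5) + (4/7)(1/5) = 76/175, so item 2 is recommended first. Only the work inside `recommend` is done online; it touches one row of the association table per basket item.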

One skilled in the art can utilize many techniques to reduce the storage
requirement of the present invention when the number of items is very
large: reduced accuracy for probabilities, sparse matrix storing techniques, and
clustering of like items to reduce the number of items, which can be later
refined for the cluster members after the cluster preferences are computed.
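
Two of these storage reductions, sparse storage and reduced accuracy, can be illustrated together in a short Python sketch; clustering is not shown. The helper names, the rounding to four decimal places and the example table are assumptions, not a prescribed representation.

```python
def to_sparse(dense, digits=4):
    """Keep only nonzero entries, rounded, as {row: {column: value}}."""
    sparse = {}
    for i, row in enumerate(dense):
        kept = {j: round(v, digits) for j, v in enumerate(row) if v > 0}
        if kept:
            sparse[i] = kept
    return sparse


def lookup(sparse, i, j):
    """Return the stored value, or 0.0 for any pair that was not stored."""
    return sparse.get(i, {}).get(j, 0.0)


# Hypothetical 3-by-3 table of P(j | asso, i) values, mostly zero.
dense = [
    [0.0, 0.33333333, 0.0],
    [0.5, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]
sparse = to_sparse(dense)
```

Only two of the nine entries are retained here, and a lookup of an absent pair returns zero, which is the value such a pair would have had in the full m by m array.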

While the invention has been described in terms of a single preferred
embodiment, those skilled in the art will recognize that the invention can be
practiced with modification within the spirit and scope of the appended
claims.